fix: forward warmup_ratio correctly to TrainingArguments by Neonkraft · Pull Request #21 · OpenEuroLLM/post-training

Neonkraft · 2026-04-29T14:05:42Z

Summary

Drop warmup_ratio from TrainingConfig and make warmup_steps a float. The original bug: warmup_ratio (e.g. 0.03) was being forwarded to warmup_steps (an int field in TrainingArguments), silently truncating to 0 and disabling warmup entirely. HuggingFace TrainingArguments already interprets a float value for warmup_steps as a ratio, so a single float field covers both use cases with no special handling needed.

Type of change

warmup_ratio (float 0.03) was passed to warmup_steps (int), truncating to 0 and silently disabling warmup entirely. Add a warmup_steps config field (default 0) and forward both fields to TrainingArguments so each reaches the correct parameter.

KonstiNik

Thanks for picking this up!

Tho, I'd argue we should follow HF's API more directly here. Maintaining warmup_ratio and warmup_steps as two separate fields for the same underlying concept doesn't really hold up — HF themselves consolidated this in 5.2 into a single warmup_steps: float where < 1 is interpreted as a ratio and >= 1 as absolute steps (training_args.py:788-793, training_args.py:2047-2054). One knob covers both use cases cleanly.

Concretely I suggest:

Drop warmup_ratio from TrainingConfig and rename to warmup_steps. The piece I think matters most: users need to be told what to put in this field, since accepting both ratios and absolute step counts is non-obvious from the name alone. Use the field(metadata=...) form (same pattern HF uses in TrainingArguments), and we can surface it in --help or auto-generated docs later:

warmup_steps: float = field(
    default=0.03,
    metadata={
        "help": (
            "Linear warmup duration. Values in [0, 1) are interpreted as a "
            "ratio of total training steps; values >= 1 are interpreted as an "
            "absolute number of steps; 0 disables warmup."
        )
    },
)

Forward only warmup_steps to TrainingArguments.
Add a small shim in PostTrainingConfig.load (config.py:270, between OmegaConf.load and the schema merge) so old YAMLs don't break:

if training is not None and "warmup_ratio" in training:
    logger.warning(
        "training.warmup_ratio is deprecated; use training.warmup_steps "
        "(values < 1 are interpreted as a ratio). Auto-migrating."
    )
    training.warmup_steps = training.pop("warmup_ratio")

(Needs import logging + a module-level logger — neither exists in config.py today.)

…at field Drop warmup_ratio; warmup_steps is now a float where values in [0, 1) are treated as a ratio and values >= 1 as absolute steps, matching HF's TrainingArguments behaviour. Adds a backward-compat shim that auto-migrates warmup_ratio in old YAMLs with a deprecation warning.

Neonkraft · 2026-04-30T14:18:27Z

Done.

KonstiNik

Refactor looks great. One small question: the default went from 0.03 to 0.0, so configs that didn't set warmup explicitly will now get no warmup instead of 3%. Was the default change intentional, or worth flipping back to 0.03 to keep the old behavior?

Neonkraft · 2026-04-30T15:22:28Z

Was the default change intentional, or worth flipping back to 0.03

It was intentional. TRL's default is 0.0 so I thought it best to mimic it.

Neonkraft requested a review from KonstiNik April 29, 2026 14:05

KonstiNik reviewed Apr 30, 2026

View reviewed changes

KonstiNik approved these changes May 1, 2026

View reviewed changes

Neonkraft merged commit 895eab0 into main May 1, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: forward warmup_ratio correctly to TrainingArguments#21

fix: forward warmup_ratio correctly to TrainingArguments#21
Neonkraft merged 2 commits intomainfrom
fix/warmup-steps

Neonkraft commented Apr 29, 2026

Uh oh!

KonstiNik left a comment

Uh oh!

Neonkraft commented Apr 30, 2026

Uh oh!

KonstiNik left a comment

Uh oh!

Neonkraft commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Neonkraft commented Apr 29, 2026

Summary

Type of change

Uh oh!

KonstiNik left a comment

Choose a reason for hiding this comment

Uh oh!

Neonkraft commented Apr 30, 2026

Uh oh!

KonstiNik left a comment

Choose a reason for hiding this comment

Uh oh!

Neonkraft commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants